Comparison of Research Spending on New Drug Approvals by the National Institutes of Health vs the Pharmaceutical Industry, 2010-2019

Key Points
Question: How does National Institutes of Health (NIH) investment in pharmaceutical innovation compare with investment by the pharmaceutical industry?
Findings: In this cross-sectional study of 356 drugs approved by the US Food and Drug Administration from 2010 to 2019, the NIH spent $1.44 billion per approval on basic or applied research for products with novel targets or $599 million per approval considering applications of basic research to multiple products. Spending from the NIH was not less than industry spending, with full costs of these investments calculated with comparable accounting.
Meaning: The results of this cross-sectional study suggest that the relative scale of NIH and industry investment in new drugs may provide a basis for calibrating the balance of social and private returns from these products.

and are termed "project year costs" or "costs." The NIH RePORTER database also describes the project start year, project end year, and PMIDs citing funding from that award.
NIH costs associated with each PMID were identified through the following steps:
1. PMIDs were associated with one or more NIH-funded projects cited as funding that research, using the NIH RePORTER Publication "link tables". Each project is identified by a Project Number comprising the Activity Code for that award, the institute making that award, and a unique identifier number. Data associated with each project include the project start and end years and costs for each fiscal year, subproject, and supplemental award. Activity Codes are described at (https://grants.nih.gov/grants/funding/ac_search_results.htm) and are further characterized by NIH funding program (https://grants.nih.gov/grants/funding/funding_program.htm).
2. For PMIDs with a publication date before the project start year of an associated project, no project year or NIH costs were assigned to the PMID.
3. For PMIDs with a publication date during the term of the award, the project year and project costs associated with the year of publication were assigned to the PMID.
4. For PMIDs published 1-4 years after the project end year, the project year and project costs associated with the end year were assigned to the PMID. This accounts for the estimated 3-year lag between NIH funding in RePORTER and publication dates (Boyack and Jordan 2011).
5. For PMIDs published >4 years after the end year of the project, no project years or costs were assigned.
6. Previously reported sensitivity analysis suggested that this method associates 86.3% of PMIDs with NIH costs, a fraction consistent with prior descriptions of false positive and negative findings in the RePORTER database (Zhou 2022).
7. Note that project years and costs may be assigned to multiple PMIDs and that each PMID may be represented more than once in the dataset (see above). The resulting duplication is addressed in the final step of analyzing any specific subset of drugs or drug targets (see below).
8. The number of PMIDs, project years, or project year costs associated with each drug were calculated from the number of PMIDs identified in the drug search, the project years associated with those PMIDs, and the costs associated with those project years after eliminating duplicates. This is categorized as applied research. Analyses were performed on all drugs and after elimination of costs beyond the 95th percentile to remove outliers resulting from searches contaminated with ambiguous (generic) drug names, for example, clotting factors, hormones, and proteins such as alpha-1 antitrypsin.
9. The number of PMIDs, project years, or project year costs associated with each target were calculated from the number of PMIDs identified in the target search but not the drug search, the project years associated with those PMIDs, and the costs associated with those project years after eliminating duplicates. This is categorized as basic research. Analyses were performed on all drugs and after elimination of costs beyond the 95th percentile to remove outliers resulting from searches contaminated with ambiguous target names, for example, CD-4, bcl-2, and EGFR, which are also used as adjectives.
10. Duplicate entries arise from PMIDs identified in more than one search and project years that support multiple PMIDs. The PMIDs, project years, and costs associated with a subset of drugs or their targets (i.e., individual drugs or targets, first-to-target drugs, a therapeutic area, or Activity Codes) are determined after eliminating duplicate PMIDs, project years, and costs within that category. Within each subset, a PMID is considered applied research if it is associated with a drug search within that subset and is considered basic research if it is associated only with searches for targets within that subset. A project year and its costs are considered applied research if at least one PMID supported by that project year was identified by searching for a drug in that subset and are considered basic research if every PMID supported by that project year was identified by searching for a target in that subset. Note that a PMID, a project year, or its costs may be represented in multiple subsets.
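The assignment and deduplication rules above can be illustrated with a minimal Python sketch. It is not the authors' SQL or their published Python code. The function names, the tuple layout of the link records, and the example project number and dollar values are all hypothetical. Only the year-matching rules (steps 2-5), the elimination of duplicate project years (step 10), and the applied/basic rule come from the text.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

MAX_LAG_YEARS = 4  # steps 4-5: accept publications up to 4 years after project end


def assigned_project_year(pub_year: int, start_year: int, end_year: int) -> Optional[int]:
    """Return the project year whose costs are assigned to a PMID, or None."""
    if pub_year < start_year:
        return None  # step 2: published before the project started
    if pub_year <= end_year:
        return pub_year  # step 3: published during the award
    if pub_year - end_year <= MAX_LAG_YEARS:
        return end_year  # step 4: published 1-4 years after the end year
    return None  # step 5: published more than 4 years after the end year


def classify(found_by: Set[str]) -> str:
    """Applied if any supporting PMID came from a drug search, otherwise basic."""
    return "applied" if "drug" in found_by else "basic"


def subset_costs(
    links: Iterable[Tuple[int, str, int, int, int, str]],
    costs: Dict[Tuple[str, int], float],
) -> Dict[str, float]:
    """Sum costs over unique (project, year) pairs, split into applied and basic.

    links: (pmid, project, pub_year, start_year, end_year, search_type) records
           (hypothetical layout); search_type is "drug" or "target".
    costs: project year costs keyed by (project, fiscal_year).
    """
    searches: Dict[Tuple[str, int], Set[str]] = {}
    for _pmid, project, pub_year, start, end, search_type in links:
        year = assigned_project_year(pub_year, start, end)
        if year is not None and (project, year) in costs:
            # A set per project year removes duplicates arising from many PMIDs.
            searches.setdefault((project, year), set()).add(search_type)
    totals = {"applied": 0.0, "basic": 0.0}
    for project_year, found_by in searches.items():
        totals[classify(found_by)] += costs[project_year]
    return totals


# Hypothetical example: one project running 2003-2007 with two costed years.
example_costs = {("R01-X", 2005): 100.0, ("R01-X", 2007): 250.0}
example_links = [
    (1, "R01-X", 2005, 2003, 2007, "target"),  # in-term, target search
    (2, "R01-X", 2005, 2003, 2007, "drug"),    # same project year, drug search
    (3, "R01-X", 2010, 2003, 2007, "target"),  # 3 years after end: end year costs
    (4, "R01-X", 2012, 2003, 2007, "drug"),    # 5 years after end: not assigned
    (5, "R01-X", 2001, 2003, 2007, "drug"),    # before start: not assigned
]
example_totals = subset_costs(example_links, example_costs)
```

In this example the 2005 project year is counted once and classified as applied, because one of its two PMIDs came from a drug search. The 2007 project year is classified as basic.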
Analysis was performed in SQL with data in a PostgreSQL database as previously described (Cleary, Beierlein et al. 2018, Cleary, Jackson et al. 2020). Since completion of this work, this method has been replicated in Python code that is freely available at https://github.com/BentleySciIndustry/NIH-Contribution-tophased-clinical-development-of-drugs-approved-Supplemental-Data-Sharing.git. This method has also been described in detail at https://zenodo.org/record/7590163#.Y-LP8nbMJnI and can be accessed through a dashboard at https://www.bentley.edu/centers/center-integration-science-and-industry/nih-fundingdrug-innovation-dashboard

Estimating costs with discount rates
NIH costs were estimated with 3% and 7% discount rates as recommended by the Office of Management and Budget (OMB 1992, OMB 2017) as well as with a 10.5% value equivalent to the cost of capital used in estimates of industry funding by DiMasi et al. (DiMasi, Grabowski et al. 2016) and Wouters et al. (Wouters, McKee et al. 2020).
For this analysis:
1. NIH costs for research on each drug were calculated with compounded annual discount rates of 3%, 7%, or 10.5% from 2000-2020. This is categorized as applied science.
2. Per drug costs for applied research were calculated as the average of costs for applied research on all drugs in the dataset with discount rates of 3%, 7%, or 10.5%.
3. For first-to-target drugs, NIH costs for research on each drug target were calculated with compounded annual discount rates of 3%, 7%, or 10.5% from 2000-2020. This is categorized as basic science.
4. Per drug NIH costs for basic research were calculated as the average of costs for basic research on first-in-class drugs with discount rates of 3%, 7%, or 10.5%. This represents the average cost of basic research leading to first approval of a drug associated with that target.
Analyses were performed in Excel.
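
As an illustration of the discounting steps above, the following Python sketch compounds each year's cost forward to a reference year and then averages across drugs. The authors performed this analysis in Excel. The function names, the choice of 2020 as the reference year, the end-of-year compounding convention, and the example values are assumptions made for this sketch and are not taken from the source.

```python
from statistics import mean
from typing import Dict

RATES = (0.03, 0.07, 0.105)  # discount rates named in the text


def capitalized_cost(costs_by_year: Dict[int, float], rate: float,
                     reference_year: int = 2020) -> float:
    """Compound each fiscal year's cost forward to reference_year (assumed 2020)."""
    return sum(
        cost * (1.0 + rate) ** (reference_year - year)
        for year, cost in costs_by_year.items()
    )


def per_drug_average(costs_by_drug: Dict[str, Dict[int, float]], rate: float) -> float:
    """Average capitalized cost across all drugs in the dataset."""
    return mean(capitalized_cost(c, rate) for c in costs_by_drug.values())


# Hypothetical values: cost per fiscal year for two drugs.
example = {
    "drug_a": {2018: 100.0, 2020: 50.0},
    "drug_b": {2019: 200.0},
}
averages = {rate: per_drug_average(example, rate) for rate in RATES}
```

With a rate of 0 the function returns the undiscounted sum, which is a convenient check on any implementation of this step.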

Estimating per drug NIH costs for failed clinical trials
Clinical phase-specific NIH funding for clinical trials of the drugs in this dataset has been described by Zhou et al. (Zhou 2022). In this analysis, PMIDs were identified by searching for the drugs in this dataset also having Publication Types "Clinical Trial," "phase 1," "phase 2," or "phase 3" or having an NCT number in the abstract. PMIDs were assigned to each phase based on the Publication Type or the clinical phase associated with the NCT number in clinicaltrials.gov (https://clinicaltrials.gov/ct2/home). Per drug NIH costs for each phase are the sum of the costs associated with PMIDs describing trials at that phase. Sensitivity and specificity statistics for this analysis have been described by Zhou et al. (Zhou 2022).
The clinical phase-specific success rates for products in clinical development were reported by DiMasi et al. (DiMasi, Grabowski et al. 2016). Using these fractions, the number of phase 1, phase 2, and phase 3 trials undertaken to achieve one approval was calculated. The number of failed clinical trials is the average number of trials at each phase minus 1 (eTable 5).
The estimated per drug cost of NIH spending for failed clinical trials is calculated as the product of the number of trials performed at each phase (not counting trials on the approved product) and the average NIH costs for each phase. It should be noted that both the clinical phase success rates described by DiMasi et al. (DiMasi, Grabowski et al. 2016) and the average NIH costs for each phase are related not to the number of individual trials in clinicaltrials.gov, but rather to the number of products proceeding through that phase and the NIH costs incurred.
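
The arithmetic described above can be sketched in Python as follows. The success rates and per-phase costs below are placeholder values chosen for easy checking. They are not the figures reported by DiMasi et al. or in eTable 5. The function names and the optional `p_review` parameter (the probability of approval after phase 3, defaulting to 1) are assumptions of this sketch.

```python
from typing import Dict


def trials_per_approval(p1: float, p2: float, p3: float,
                        p_review: float = 1.0) -> Dict[int, float]:
    """Number of products entering each phase per one eventual approval.

    p1, p2, p3 are the probabilities of advancing out of phases 1, 2, and 3.
    """
    n3 = 1.0 / (p3 * p_review)  # phase 3 entrants needed for one approval
    n2 = n3 / p2                # phase 2 entrants needed to supply those
    n1 = n2 / p1                # phase 1 entrants needed to supply those
    return {1: n1, 2: n2, 3: n3}


def failed_trial_cost(n_by_phase: Dict[int, float],
                      avg_cost_by_phase: Dict[int, float]) -> float:
    """Cost of failures: (trials per approval - 1) x average cost, summed over phases."""
    return sum(
        (n_by_phase[phase] - 1.0) * avg_cost_by_phase[phase]
        for phase in n_by_phase
    )


# Placeholder inputs, not values from the source.
n = trials_per_approval(p1=0.5, p2=0.4, p3=0.5)
cost = failed_trial_cost(n, {1: 1.0, 2: 2.0, 3: 4.0})
```

Subtracting 1 at each phase removes the approved product itself, so that only unsuccessful candidates contribute to the failure cost.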

Estimating spillover effects
Spillover effects are defined as the application of basic research on a drug target to more than one approved product. Santos et al. have estimated the number of biological targets for 1,194 approved drugs (Santos, Ursu et al. 2017). The average number of drugs per target was estimated from these data after eliminating drugs derived from blood or tissue, diagnostic agents, vaccines, and antimicrobials (the exclusion criteria for the present study).
NIH costs in the presence of spillover effects are estimated as total NIH costs categorized as basic research divided by the average number of drugs associated with each known biological target.
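
The spillover adjustment is a single division, shown here as a short Python sketch. The function name and the example inputs are hypothetical. The actual average number of drugs per target is the value derived from Santos et al. and is not reproduced here.

```python
def basic_cost_with_spillover(basic_cost: float, avg_drugs_per_target: float) -> float:
    """Share basic research costs across the drugs that act on the same target."""
    if avg_drugs_per_target <= 0:
        raise ValueError("avg_drugs_per_target must be positive")
    return basic_cost / avg_drugs_per_target


# Placeholder inputs, not values from the source.
adjusted = basic_cost_with_spillover(basic_cost=1000.0, avg_drugs_per_target=2.5)
```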

Comparison of NIH and industry costs
Average NIH costs were compared to average industry costs reported by DiMasi et al. (DiMasi, Grabowski et al. 2016) or Wouters et al. (Wouters, McKee et al. 2020). Drug-specific costs were compared for 81 first-in-class drugs with NIH costs estimated in the present dataset and 63 drugs with industry costs described by Wouters et al. (Wouters, McKee et al. 2020) using univariate regression of the form Cost_i = β0 + β1 × Source_i + ε_i, where Cost_i is the estimated NIH or industry cost for basic and applied research on the product, Source_i is an indicator variable with a value of 0 for NIH and 1 for industry, and ε_i is the error term. In this model, β0 describes the median and 95% confidence interval for NIH spending and β1 the median and 95% confidence interval for the difference between NIH and industry spending.
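
A regression on a single 0/1 indicator can be sketched in plain Python as below. This sketch assumes ordinary least squares, under which β0 equals the NIH group mean and β1 the difference between group means. The source does not state the estimator or any transformation of costs used in Excel, so this should not be read as the authors' exact procedure. The function name and data values are hypothetical, and confidence intervals are omitted.

```python
from statistics import mean
from typing import Sequence, Tuple


def fit_source_indicator(nih_costs: Sequence[float],
                         industry_costs: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of Cost_i = b0 + b1 * Source_i (Source: 0 = NIH, 1 = industry)."""
    x = [0.0] * len(nih_costs) + [1.0] * len(industry_costs)
    y = list(nih_costs) + list(industry_costs)
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta1 = sxy / sxx               # slope: industry minus NIH
    beta0 = y_bar - beta1 * x_bar   # intercept: NIH level
    return beta0, beta1


# Placeholder data, not values from the source.
b0, b1 = fit_source_indicator([2.0, 4.0], [5.0, 7.0])
```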
Regression analyses were performed in Excel. Aimovig (